New Primal SVM Solver with Linear Computational Cost for Big Data Classifications
Abstract
The Support Vector Machine (SVM) is among the most popular classification techniques in machine learning, so designing fast primal SVM algorithms for large-scale datasets has been an active topic in recent years. This paper presents a new L2-norm regularized primal SVM solver using Augmented Lagrange Multipliers, with linear computational cost for Lp-norm loss functions. The most computationally intensive steps of the proposed algorithm (those that determine its algorithmic complexity) are simply matrix-by-vector multiplications, which can be easily parallelized on a multi-core server. We implement our algorithm and integrate it into the interfaces and framework of the well-known LibLinear software toolbox. Experiments show that our algorithm performs stably and is on average faster than state-of-the-art solvers such as SVMperf, Pegasos, and the LibLinear that integrates the TRON, PCD and DCD algorithms.
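To make the cost claim concrete, here is a minimal sketch of evaluating an L2-norm regularized primal SVM objective. It is not the paper's Augmented Lagrange Multiplier solver. The hinge-type loss raised to the power p, the trade-off constant `C`, and the helper names `matvec` and `primal_objective` are assumptions made for illustration; the abstract does not give the exact formulation. The sketch only shows that the dominant work is a single matrix-by-vector product.

```python
def matvec(X, w):
    """Return the product X w for a dense matrix X stored as a list of rows."""
    return [sum(x_j * w_j for x_j, w_j in zip(row, w)) for row in X]


def primal_objective(X, y, w, C=1.0, p=1):
    """Assumed form: 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * <x_i, w>)^p."""
    margins = matvec(X, w)  # the only step whose cost grows with the data size
    reg = 0.5 * sum(w_j * w_j for w_j in w)
    loss = sum(max(0.0, 1.0 - y_i * m_i) ** p for y_i, m_i in zip(y, margins))
    return reg + C * loss


# Toy data: three points in two dimensions, labels in {+1, -1}.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
y = [1, 1, -1]
w = [0.5, 0.5]

obj_l1 = primal_objective(X, y, w, C=1.0, p=1)  # L1-norm (hinge) loss
obj_l2 = primal_objective(X, y, w, C=1.0, p=2)  # L2-norm (squared hinge) loss
```

Because each row of `X` is handled independently inside `matvec`, the rows could in principle be split across cores, which is the kind of parallelism the abstract refers to.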
Similar resources
Multi-class Classification in Big Data
The paper proposes an on-line multi-class classifier with sublinear computational complexity relative to the number of training objects. The proposed approach is based on combining two-class probabilistic classifiers. Pairwise coupling is a popular multi-class classification method that combines all comparisons for each pair of classes. Unfortunately, pairwise coupling suffers in many cases...
Linear SVM training using separability and interior point methods
Support vector machine training can be represented as a large quadratic program. We present an efficient and numerically stable algorithm for this problem using primal-dual interior point methods. Reformulating the problem to exploit separability of the Hessian eliminates the main source of computational complexity, resulting in an algorithm which requires only O(n) operations per iteration. Ext...
A truncated primal-infeasible dual-feasible network interior point method
In this paper, we introduce the truncated primal-infeasible dual-feasible interior point algorithm for linear programming and describe an implementation of this algorithm for solving the minimum cost network flow problem. In each iteration, the linear system that determines the search direction is computed inexactly, and the norm of the resulting residual vector is used in the stopping criteria...
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
We describe and analyze a simple and effective stochastic sub-gradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ε is Õ(1/ε), where each iteration operates on a single training example. In contrast, previous analyses of stochastic gradient descent methods for SVMs r...
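The stochastic sub-gradient step described above, in which each iteration touches a single training example, can be sketched as follows. This is a simplified illustration and not the authors' reference implementation: the step size 1/(λt), the omission of a bias term and of any projection step, and the names `pegasos_sketch`, `lam`, `n_iters`, and `seed` are assumptions not taken from the snippet.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(a_j * b_j for a_j, b_j in zip(a, b))


def pegasos_sketch(X, y, lam=0.1, n_iters=200, seed=0):
    """Stochastic sub-gradient descent on lam/2 * ||w||^2 + mean hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(X))      # one training example per iteration
        eta = 1.0 / (lam * t)          # assumed step-size schedule
        violated = y[i] * dot(w, X[i]) < 1.0
        # Shrink w (gradient of the regularizer) ...
        w = [(1.0 - eta * lam) * w_j for w_j in w]
        # ... and move toward the example if its margin is violated.
        if violated:
            w = [w_j + eta * y[i] * x_j for w_j, x_j in zip(w, X[i])]
    return w


# Linearly separable toy data, labels in {+1, -1}.
X_toy = [[2.0, 2.0], [1.5, 2.5], [-2.0, -2.0], [-2.5, -1.5]]
y_toy = [1, 1, -1, -1]
w_toy = pegasos_sketch(X_toy, y_toy)
predictions = [1 if dot(w_toy, x) > 0 else -1 for x in X_toy]
```

Each iteration costs time proportional to the number of features only, independent of the number of training examples, which is why the iteration count is the quantity the analysis bounds.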
Presented a method for estimating the cost of software using PCA to reduce the size and with the help of data mining
These days, data mining is one of the most significant issues. Data mining is a field that mixes computer science and statistics, and it is considerably limited by the increase in digital data and the growth of computers' computational power. One of the domains of data mining is the software cost estimation category. In this article, classifying techniques of learning algorithm of machine ...